On the Limits of Innate Planning in Large Language Models
Schepanowski, Charles, Ling, Charles
Large language models (LLMs) achieve impressive results on many benchmarks, yet their capacity for planning and stateful reasoning remains unclear. We study these abilities directly, without code execution or other tools, using the 8-puzzle: a classic task that requires state tracking and goal-directed planning while allowing precise, step-by-step evaluation. Four models are tested under common prompting conditions (Zero-Shot, Chain-of-Thought, Algorithm-of-Thought) and with tiered corrective feedback. Feedback improves success rates for some model-prompt combinations, but many successful runs are long, computationally expensive, and indirect. We then examine the models with an external move validator that provides only valid moves. Despite this level of assistance, none of the models solve any puzzles in this setting. Qualitative analysis reveals two dominant deficits across all models: (1) brittle internal state representations, leading to frequent invalid moves, and (2) weak heuristic planning, with models entering loops or selecting actions that do not reduce the distance to the goal state. These findings indicate that, in the absence of external tools such as code interpreters, current LLMs have substantial limitations in planning and that further progress may require mechanisms for maintaining explicit state and performing structured search.
- North America > United States > Virginia (0.04)
- Europe > France (0.04)
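The 8-puzzle setup described above is easy to make concrete. Below is a minimal, hypothetical sketch (not the paper's code) of the two ingredients the abstract refers to: an external validator that enumerates the legal moves from a state, and a Manhattan-distance heuristic of the kind the models failed to reduce:

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 denotes the blank tile

def valid_moves(state):
    """Return all states reachable by sliding one tile into the blank."""
    i = state.index(0)
    r, c = divmod(i, 3)
    moves = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:
            j = nr * 3 + nc
            s = list(state)
            s[i], s[j] = s[j], s[i]  # slide the neighbouring tile
            moves.append(tuple(s))
    return moves

def manhattan(state):
    """Sum of tile distances to their goal positions (admissible heuristic)."""
    total = 0
    for pos, tile in enumerate(state):
        if tile:  # skip the blank
            goal = GOAL.index(tile)
            total += abs(pos // 3 - goal // 3) + abs(pos % 3 - goal % 3)
    return total
```

A validator like `valid_moves` removes the state-tracking burden from the model entirely, which makes the reported zero solve rate in that setting all the more striking.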
Learning a Prior for Monte Carlo Search by Replaying Solutions to Combinatorial Problems
Monte Carlo Search gives excellent results on many difficult combinatorial problems. Using a prior to perform non-uniform playouts during the search greatly improves results compared to uniform playouts. Handmade heuristics tailored to the combinatorial problem are often used as priors. We propose a method to compute a prior automatically, using statistics on solved problems. It is a simple and general method that incurs no computational cost at playout time and brings large performance gains. The method is applied to three difficult combinatorial problems: Latin Square Completion, Kakuro, and Inverse RNA Folding.
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Leisure & Entertainment > Games (0.94)
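The core idea, as the abstract describes it, can be sketched as follows; the feature function and weighting scheme here are illustrative placeholders, not the paper's actual prior:

```python
# Hypothetical sketch: replay solutions of already-solved instances, count how
# often each move feature occurs, and reuse the frequencies as playout weights.
from collections import Counter
import random

def learn_prior(solutions, feature):
    """Estimate a prior by counting move features over solved problems."""
    counts = Counter()
    for solution in solutions:
        for move in solution:
            counts[feature(move)] += 1
    total = sum(counts.values())
    return {f: c / total for f, c in counts.items()}

def biased_choice(moves, prior, feature, rng=random):
    """Sample a playout move with probability proportional to its prior weight."""
    weights = [prior.get(feature(m), 1e-6) for m in moves]  # small floor for unseen features
    return rng.choices(moves, weights=weights, k=1)[0]
```

Because the weights are precomputed once from the solved corpus, the playout itself pays nothing beyond a weighted sample, consistent with the "no computational cost at playout time" claim.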
Generalized Nested Rollout Policy Adaptation with Limited Repetitions
Generalized Nested Rollout Policy Adaptation (GNRPA) is a Monte Carlo search algorithm for optimizing a sequence of choices. We propose to improve on GNRPA by avoiding too deterministic policies that find again and again the same sequence of choices. We do so by limiting the number of repetitions of the best sequence found at a given level. Experiments show that it improves the algorithm for three different combinatorial problems: Inverse RNA Folding, the Traveling Salesman Problem with Time Windows and the Weak Schur problem.
- Leisure & Entertainment > Games (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.52)
- Information Technology > Artificial Intelligence > Games > Go (0.47)
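A rough, simplified sketch of the mechanism (assuming a standard NRPA skeleton; the real GNRPA also generalizes the playout policy with a bias term, which is omitted here). The repetition cap stops a level early once its best sequence keeps being rediscovered:

```python
import math
import random

def playout(policy, legal_moves, score, length, rng):
    """Sample one sequence softmax-proportionally to the policy weights."""
    seq = []
    for _ in range(length):
        moves = legal_moves(seq)
        w = [math.exp(policy.get(m, 0.0)) for m in moves]
        seq.append(rng.choices(moves, weights=w, k=1)[0])
    return score(seq), seq

def adapt(policy, seq, legal_moves, alpha=1.0):
    """Shift policy weight toward the moves of the best sequence."""
    pol = dict(policy)
    state = []
    for move in seq:
        moves = legal_moves(state)
        z = sum(math.exp(policy.get(m, 0.0)) for m in moves)
        for m in moves:
            pol[m] = pol.get(m, 0.0) - alpha * math.exp(policy.get(m, 0.0)) / z
        pol[move] = pol.get(move, 0.0) + alpha
        state.append(move)
    return pol

def nrpa_lr(level, policy, legal_moves, score, length, iters, max_rep, rng):
    """Nested search; a level stops early after max_rep repeats of its best."""
    if level == 0:
        return playout(policy, legal_moves, score, length, rng)
    best_score, best_seq, reps = -math.inf, None, 0
    for _ in range(iters):
        s, seq = nrpa_lr(level - 1, policy, legal_moves, score, length,
                         iters, max_rep, rng)
        if s > best_score:
            best_score, best_seq, reps = s, seq, 0
        elif seq == best_seq:
            reps += 1
            if reps >= max_rep:  # policy has become too deterministic
                break
        policy = adapt(policy, best_seq, legal_moves)
    return best_score, best_seq
```

On a toy problem (maximize the count of `'a'` in a length-5 sequence over the alphabet `{'a','b'}`), the nested adaptation quickly concentrates on the all-`'a'` sequence, and the cap keeps upper levels from burning iterations on it.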
Nested Search versus Limited Discrepancy Search
Limited Discrepancy Search (LDS) is a popular algorithm for searching a state space with a heuristic that orders the possible actions. Nested Search (NS) is another algorithm for searching a state space with the same heuristic. NS spends more time on the move associated with the best heuristic playout, while LDS spends more time on the best heuristic move. Both use similar times for the same level of search. We advocate in this paper that it is often better to follow the best heuristic playout, as in NS, than to follow the heuristic move by move, as in LDS.
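For reference, here is a minimal sketch of the LDS side of the comparison, assuming a tree whose children are heuristically ordered best-first (simplified: this iterative version re-explores paths with fewer discrepancies rather than visiting each path exactly once):

```python
def lds(state, children, is_goal, discrepancies):
    """Depth-first search allowing at most `discrepancies` non-best choices."""
    if is_goal(state):
        return state
    for rank, child in enumerate(children(state)):  # index 0 = heuristic's pick
        cost = 0 if rank == 0 else 1  # deviating from the heuristic costs one discrepancy
        if cost <= discrepancies:
            found = lds(child, children, is_goal, discrepancies - cost)
            if found is not None:
                return found
    return None

def iterative_lds(root, children, is_goal, max_disc):
    """Try discrepancy budgets 0, 1, ... until a goal is found."""
    for d in range(max_disc + 1):
        found = lds(root, children, is_goal, d)
        if found is not None:
            return found, d
    return None, None
```

The contrast with NS is then easy to state: LDS commits extra effort to the heuristic's top-ranked *move* at each node, whereas NS commits extra effort to whichever move led to the best complete heuristic *playout*.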
U.S. battles Putin by disclosing his next possible moves
Washington – After decades of getting schooled in information warfare by President Vladimir Putin of Russia, the United States is trying to beat the master at his own game. In recent weeks, the Biden administration has detailed the movement of Russian special operation forces to Ukraine's borders, exposed a Russian plan to create a video of a faked atrocity as a pretext for an invasion, outlined Moscow's war plans, warned that an invasion would result in possibly thousands of deaths and hinted that Russian officers had doubts about Putin. Then, on Friday, Jake Sullivan, President Joe Biden's national security adviser, told reporters at the White House that the United States was seeing signs of Russian escalation and that there was a "credible prospect" of immediate military action. Other officials said the announcement was prompted by new intelligence that signaled an invasion could begin as soon as Wednesday. All told, the extraordinary series of disclosures -- unfolding almost as quickly as information is collected and assessed -- has amounted to one of the most aggressive releases of intelligence by the United States since the Cuban missile crisis, current and former officials say.
- North America > United States (1.00)
- Asia > Russia (1.00)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.28)
- (5 more...)
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Regional Government > Europe Government > Russia Government (1.00)
- Government > Regional Government > Asia Government > Russia Government (1.00)
- Government > Military (1.00)
Generalized Nested Rollout Policy Adaptation with Dynamic Bias for Vehicle Routing
Sentuc, Julien, Cazenave, Tristan, Lucas, Jean-Yves
In this paper we present an extension of the Nested Rollout Policy Adaptation algorithm (NRPA), namely Generalized Nested Rollout Policy Adaptation (GNRPA), as well as its use for solving instances of the Vehicle Routing Problem. We detail results obtained on the Solomon instance set, a conventional benchmark for the Capacitated Vehicle Routing Problem with Time Windows (CVRPTW). We show that on all instances GNRPA performs better than NRPA. On some instances, it also performs better than the Google OR-Tools module dedicated to the VRP.
- Europe > France (0.04)
- North America > United States > Florida > Miami-Dade County > Miami Beach (0.04)
- Europe > Netherlands > South Holland > Leiden (0.04)
- Europe > Italy > Piedmont > Turin Province > Turin (0.04)
- Transportation > Freight & Logistics Services (0.94)
- Leisure & Entertainment > Games (0.93)
Improving Minimax performance
The Minimax algorithm, also known as MinMax, is a popular algorithm for calculating the best possible move a player can play in a zero-sum game, like Tic-Tac-Toe or Chess. It makes use of an evaluation function provided by the developer to analyze a given game board. During execution, Minimax builds a game tree that can become quite large, which causes a very long runtime. In this article I'd like to introduce 10 methods to improve the performance of the Minimax algorithm and to optimize its runtime.
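The single best-known of these improvements is alpha-beta pruning, which skips branches that cannot affect the final decision. Here is a minimal generic sketch, where `moves`, `apply_move`, and `evaluate` are placeholders to be supplied for a concrete game:

```python
import math

def minimax(state, depth, alpha, beta, maximizing, moves, apply_move, evaluate):
    """Minimax with alpha-beta pruning over a generic game interface."""
    ms = moves(state)
    if depth == 0 or not ms:
        return evaluate(state)  # leaf or depth limit: score the board
    if maximizing:
        value = -math.inf
        for m in ms:
            value = max(value, minimax(apply_move(state, m), depth - 1,
                                       alpha, beta, False,
                                       moves, apply_move, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:  # beta cutoff: opponent will avoid this branch
                break
        return value
    value = math.inf
    for m in ms:
        value = min(value, minimax(apply_move(state, m), depth - 1,
                                   alpha, beta, True,
                                   moves, apply_move, evaluate))
        beta = min(beta, value)
        if beta <= alpha:  # alpha cutoff
            break
    return value
```

With good move ordering, alpha-beta can roughly double the searchable depth in the same time budget, which is why it is usually the first optimization applied.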
Gomoku: analysis of the game and of the player Wine
Piazzo, Lorenzo, Scarpiniti, Michele, Baccarelli, Enzo
Gomoku, also known as five in a row, is a classical board game, ideally suited for quickly testing novel Artificial Intelligence (AI) techniques. With the aim of helping a developer who wants to write a new Gomoku player, in this report we present an analysis of the main game concepts and strategies, which is wider and deeper than existing ones. Moreover, after discussing the general structure of an artificial player, we present and analyse a strong Gomoku player, named Wine, whose code is freely available on the Internet and which is an excellent example of how a modern player is organised.
- North America > United States > New York (0.04)
- North America > United States > California (0.04)
- Europe (0.04)
- Asia > East Asia (0.04)
How AI Revolutionised the Ancient Game of Chess
I have come to the personal conclusion that while all artists are not chess players, all chess players are artists. Originally called Chaturanga, the game was set on an 8x8 Ashtāpada board and shared two key fundamental features that still distinguish the game today: different pieces subject to different rules of movement, and the presence of a single king piece whose fate determines the outcome. But it was not until the 15th century, with the introduction of the queen piece and the popularization of various other rules, that we saw the game develop into the form we know today. The emergence of international chess competition in the late 19th century meant that the game took on a new geopolitical importance.
Finding optimal strategies in sequential games with the novel selection monad
The recently discovered selection monad, T x = (x -> r) -> x, provides an elegant way to find optimal strategies in sequential games. During this thesis, a library was developed which provides a set of useful functions using the selection monad to compute optimal strategies and AIs for sequential games. In order to explore the selection monad's ability to support these AI implementations, three example case studies were developed in Haskell: the two-player game Connect Four, a Sudoku solver, and a simplified version of Chess. These case studies show how to elegantly implement a game AI. Furthermore, a performance analysis of these case studies was done, identifying the major points where performance can be increased.
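The idea translates directly to other languages. Here is a hypothetical Python rendering (the thesis itself uses Haskell): a selection function picks an element given a valuation of elements, and the monadic product of two selections plays out a two-move sequential game optimally:

```python
def argmax(xs):
    """Selection function: given a valuation p, pick the x maximizing p(x)."""
    return lambda p: max(xs, key=p)

def argmin(xs):
    """Selection function for a minimizing (adversarial) player."""
    return lambda p: min(xs, key=p)

def pair(eps, delta):
    """Product of selections: optimal first move against the best reply.

    eps selects the first move; delta maps a first move to the second
    player's selection function; q scores a completed pair of moves.
    """
    def play(q):
        reply = lambda x: delta(x)(lambda y: q(x, y))   # best response to x
        x = eps(lambda x: q(x, reply(x)))               # first move, anticipating it
        return x, reply(x)
    return play
```

In a tiny zero-sum game where the first player maximizes `q(x, y) = x - y` and the second minimizes it, the product yields the minimax play without any explicit game-tree code, which is exactly the elegance the thesis is pointing at.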